Synopsis: ID Required

Get introduced to another antipattern that specifies the importance of using specific keys in the database.

We'll cover the following

Objective: Establish primary key conventions
Legitimate uses of the antipattern

Recently, a software developer asked me a question about preventing duplicate rows. It’s a question I get asked frequently and the first response that comes to mind is to ask the developer to check if she lacks a primary key. But in this case, this wasn’t the problem.

The developer in this case had stored articles for publishing on a website in his content management database. He had used an intersection table for a many-to-many association between a table of articles and a table of tags.

Enter to Rename, Shift+Enter to Preview

Creating ArticleTags table

He was getting incorrect results from queries when counting the number of articles with a given tag. He knew that there was only one article with a particular tag, but the query was telling him that there were three.

Enter to Rename, Shift+Enter to Preview

Counting the number of articles with a given tag

When he queried all the rows matching that tag_id, he saw that the tag was associated with one particular article thrice; three rows showed the same association, although they had different values for id.

Enter to Rename, Shift+Enter to Preview

Retrieving the content of the ArticleTags table

After seeing the details, we came to know that the values for article_id and tag_id are repeating three times.

Do I really need a primary key?

This table did have a primary key. However, that primary key wasn’t preventing duplicates in the columns that mattered. One remedy might be to create a UNIQUE constraint over the other two columns, but, even then, the question remains:, why is the id column needed at all?

Objective: Establish primary key conventions#

The objective is to make sure that every table has a primary key, but confusion about the nature of a primary key has resulted in an antipattern.

Everyone who has been introduced to database design knows that a primary key is an important, even mandatory, part of a table. This is true; primary keys are integral to good database design. A primary key is guaranteed to be unique over all rows in the table, so this is the logical mechanism to address individual rows and to prevent duplicate rows from being stored. A primary key is also referenced by foreign keys to create table associations.

The tricky part is choosing a column to serve as the primary key. The value of any attribute in most tables has the potential to belong in more than one row. Textbook examples such as a person’s first name and last name are clearly subject to having duplicates. Even email addresses or administrative identification numbers such as a United States Social Security number or taxpayer ID number aren’t strictly unique. A new column is needed in such tables to store an artificial value that has no meaning in the domain modeled by the table. This column is used as the primary key, so we can address rows uniquely while allowing any other attribute column to contain duplicates if it is needed. This type of primary key column is sometimes called a pseudo key or a surrogate key.

In order to make sure that rows can be given unique pseudo key values (even when concurrent clients are inserting new rows), most databases provide a mechanism to generate unique integer values serially outside the scope of transaction isolation.

Pseudo keys weren’t standardized until SQL:2003, so each database uses its own extension to SQL to implement them. Even the terminology for pseudo keys is vendor-dependent, as shown by the following table:

Features	Supported by Database Brands
`AUTO_INCREMENT`	MySQL
`GENERATOR`	Firebird, InterBase
`IDENTITY`	DB2, Derby, Microsoft SQL Server, Sybase
`ROWID`	SQLite
`SEQUENCE`	DB2, Firebird, Informix, Ingres, Oracle, PostgreSQL
`SERIAL`	MySQL, PostgreSQL

Pseudo keys are a useful feature, but they aren’t the only solution for declaring a primary key.

Legitimate uses of the antipattern#

Some object-relational frameworks simplify development by assuming convention over configuration. These frameworks expect every table to define its primary key in the same way — as an integer pseudo key column named id. It is best to conform to these conventions when we are using such a framework because this gives us access to other desirable features of the framework.

There’s also nothing wrong with using a pseudo key or assigning values from an auto-incrementing integer mechanism. But not every table really needs a pseudo key, and it’s not necessary to name every pseudo key id.

A pseudo key is a good choice as a surrogate for a natural key that’s too long to be practical. For example, for a table that records attributes of a file on the filesystem, the path of the file might be a good natural key, but it would be costly to index a string column that long.

Comparison of Different Tree Implementations

Antipattern: One Size Fits All

Mark as Completed

Report an Issue

Introduction

Logical Antipattern - Jaywalking

Logical Antipattern - Naive Trees

Logical Antipattern - ID Required

Logical Antipattern - Keyless Entry

Assessment: Logical Antipatterns - Part 1

Logical Antipattern - Entity-Attribute-Value

Logical Antipattern - Polymorphic Associations

Logical Antipattern - Multicolumn Attributes

Logical Antipattern - Metadata Tribbles

Assessment: Logical Antipatterns - Part 2

Physical Antipattern - Rounding Errors

Physical Antipattern - 31 Flavors

Physical Antipattern - Phantom Files

Physical Antipattern - Index Shotgun

Assessment: Physical Antipatterns

Query Antipattern - Fear of the Unknown

Query Antipattern - Ambiguous Groups

Query Antipattern - Random Selection

Query Antipattern - Poor Man’s Search Engine

Query Antipattern - Spaghetti Query

Query Antipattern - Implicit Columns

Assessment: Query Antipattern

Application Development Antipattern - Readable Passwords

Application Development Antipattern - SQL Injection

Application Development Antipattern - Pseudokey Neat-Freak

Application Development Antipattern - See No Evil

Application Development Antipattern - Diplomatic Immunity

Application Development Antipattern - Magic Beans

Assessment: Application Development Antipattern

Conclusion

Synopsis: ID Required

Objective: Establish primary key conventions#

Legitimate uses of the antipattern#